Two Theorems on the Accuracy of 

Numerical Solutions of Systems of 

Ordinary Differential Equations 

By I. W. SANDBERG 

(Manuscript received March 13, 1967) 

We consider the accuracy with which a numerical solution of the system of 
ordinary differential equations 

x + f(x, t) = 0, t^O 

can be obtained by the use of a numerical integration formula of the well- 
known type 

V»+i = £ «*?/»-*• I^Z b k y' n _ k . 

*-0 *=-l 

For the scalar case, under some natural assumptions, and assuming that 
a and are real constants such that 

.^«^, /so 

ox 
at every point x, it is proved that if 

m = i - Z a^- a+i » + k« + m Z b*r (k+i) ^ o 

*-0 A- = -l 

for all \z\ — 1, then (e), the root-mean-squared error over a given interval, 
between the true samples of x(t) and the y„, satisfies 

(e) £ (1 + p)" 1 min | F(e") I' 1 (*>) 

in which p depends on a, 0, the a k , and the b k , and (<p) takes into account 
the local roundoff and truncation errors as well as errors in the starting 
values for computing the y n . 

IS the condition on F{z) stated above holds, and if p < 1, then 

(e) £ (1 - p)- 1 max | F(0 I" 1 < fP >. 
1243 
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The significance of the key assumptions is discussed and two examples 
are given. 

I. INTRODUCTION 

In this paper we present some results concerning the accuracy with 
which a numerical solution of the system of ordinary differential 
equations 

x + f(x, = 0, t^O [x(0) = * ] (1) 

can be obtained by the use of a numerical integration formula of the 
well-known type 1 

?/n + l = Z VkVn-k + h S Wn-k , U^p. (2) 

In (2) the y n are approximations to the x n = x(nh), where h, a positive 
number, is the step size parameter; y , y x , • • • , y, are starting vectors, 
the last p of which are obtained by an independent method; and 

If b_i v^= 0, then y n+1 is denned implicitly, and (2) is said to be of 
closed type. It is assumed throughout that (2) can be solved* for 
y n+ i for all n ^ p. Specializations of (2) include, for example, Euler's 
method : 

y n+ i = Vn + hy' n , 
and the more useful formula 

y«+i = y n + \h(y' n + y' n+l ). 

It is assumed throughout that for t ^ 0, f{x, t) is a well-defined real 
JV-vector-valued function defined in the set of all real iV-vectors x, 
that f(x, t) satisfies (the usual weak) conditions which guarantee the 
existence and uniqueness of a solution of (1), and that the Jacobian 
matrix Bf(x, t)/Bx exists for all x and all t i= 0. 

Equation (2) ignores the roundoff error R n introduced in calculating 
2/n+i , and, in order to take R n into account, we shall consider instead 

* It is well known that if / satisfies a uniform Lipshitz condition, and if h is 
sufficiently small, then (2) possesses a unique solution y„+i which can be ob- 
tained by a simple iterative process. 1 - 3 However, this smallness condition is by no 
means always necessary. For example, if b-x > and, with a as defined in Sec- 
tion 2.3, if a > 0, then for any h > a unique solution yn+i exists and can be 
computed by an iterative process which is only slightly more complicated than 
the usual one (see Section IV) . 
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of (2) : 

V V 

y n+ i = X a*y.-k + h 2 &#£-* + #» , n ^ p. (3) 

*-0 * 1 

We may also assume that R n takes into account the error in solving 
(2) for y n +i , caused typically by truncating an iteration procedure 
after a finite number of steps. Predictor-corrector techniques can of 
course be viewed as giving rise to a degenerate (often one-step) 
iteration technique in which the "starting point" is generated by the 
predictor. 

The truncation error T„ , a basic entity associated with the integra- 
tion formula (2) and the differential equation (1), is defined for 
n ^ p by the relation 

p p 

2 n+ i = 2 a k x n - k + h J2 M»-* + T n , n ^ p 

in which x' n = —f(x n , nh). Clearly, T n is a measure of how well the 
samples x n _ p , x n - v+i , • • • , x„ +l of the solution of (1) satisfy the integra- 
tion formula. The problem of estimating T n is a classical one, and there 
are standard methods which lead to precise bounds. 1,2 

We now define a set of vectors {?„} which plays a central role in the 
subsequent discussion: 

<p n = T n - R„, ji^p (5) 

<p n = (x n+i — y n+] ) — 2 «*&.-* - y n -k) 

fc-0 

+ h , £ M/fr»-* ■ (n - k)h] - 1[y«-> , (n - AM), 

(6) 
n = -1,0, ••• ,(p - 1) 

(with the understanding that x n = y„ = f(x„ , nh) = f(y n , nh) = 
for n < 0). Note that the <p n for n = — 1, 0, • • • , (p — 1) are measures 
of the departures of the starting vectors from the exact values, and that 
if n = for n = — 1, 0, • • • , (p — 1) if the starting vectors are exact. 
We shall describe our results first for the scalar case (i.e., for 
N = 1). 

II. RESULTS 

Let e„ denote (x„ — y n ), the difference between x(nh) and its com- 
puted approximation. Suppose that A r = 1, and that a and £ are real 
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constants such that 

a £ ^ S i> (7) 

— dx 

for all t ^ (at every point x) , and that 

F(z) 4 1-2 <M" (1+11 + K« + 0A Z M~ ( * +1) * (8) 

fc-0 *— i 

for all | z | ^ 1 (including "z = 05"). We prove that then 

< e )4 ((M + l)- 1 Z| CM rY, (9) 

\ m-0 I 

the root-mean-squared value of the first (M + 1) error terms [M is an 
arbitrary positive integer greater than or equal to (p + 1) ] is bounded 
from below in terms of 

<„> 4 ((J* + 1)- £ 1 *_, i 2 )\ do) 

\ t;i=n ' 

the corresponding quantity for the f'a, in accordance with the inequality 
(e) * (1 + p)" 1 min I F(e*") | _1 (*>>. (11) 

0SuS2ir 

[FU) is defined in (8)], in which 

£ b k e- i(k+i >" 



p — h(0 — a )h max 



F(e iu ) 



(12) 



"We also prove that if in addition to the assumptions stated above, 
we have P < 1, then the sequence {e„} is bounded (i.e., there exists a 
positive constant c such that | e n | ^ c for all n ^ 0) whenever the 
sequence {y„_i} is bounded, and 

<e> £ (1 - p)" 1 max | F(e ,u ) p 1 (*>) (13) 

0SuS2r 

(whether or not {p„_i} is bounded). 

Inequality (11) provides a limitation on obtainable accuracy under 
essentially only the weak assumption that the sequence of approxima- 
tions {y„} defined by (2) approaches zero as n -> « for all sets of starting 
values when f(x, t) = |(a + 0)z. Since, by assumption, F(e iu ) 9* for 
^ w ^ 2t, it is clear that p < « . 
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The condition that p < 1 is satisfied if and only if the locus of 

V 

Elku -io, 

a k e — e 



0(«) = 



Z b#" 



(14) 



for ^ o> ^ 2tt lies outside the "critical circle" C of radius |(j8 — a)h 
centered in the complex plane at \Ua+/3)h, 0] (see Fig. 1).* 




Since 



Fig. 1 — Location of the critical circle C (for N = 1). 



p = U0 ~ a)h{mm | 0(a>) - i(« + fi)h |} -1 , 



(15) 



we see that P is the ratio of the radius of C to the distance between 
c and 6, where c is the center of C and 6 is a point nearest c on the 

loCUS Of 0(d)). 

2.1 Discussion 

The quantity (e) of course of interest in problems in which we 
are concerned with a measure of the accuracy of a numerical solution 

* If a > 0, we can express both the conditions that p < 1 and F(z) ^ for \z\ ^ 1 
entirely in terms of a condition on the locus of 0(co) _1 for w e [0, 2jt], The requirements 
on F{z)_ and p are met if the disk of radius \[(ah)- 1 — (/3/i) -1 ] centered at 
| H(ah) l + {fih) »], 0| is not "encircled" or intersected by the locus of ©(«)-». There 
ib a complication that arises as a result of the fact that 0(oj) -1 is typically not bounded. 

This comnlicnt.inn nftpn sterns frnm n 'W\naintnT>mr -mr,n',~r*m n ni'' ,.,U^„U ; i:__ j.u_i 




zeros on the unit circle, a fact that can be used to suitably define what is meant by 
the locus of 0(w) > not encircling the disk. We leave the details of the necessary 
' principle of the argument" argument to the sufficiently interested reader. 
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over a large number of steps, as opposed to the accuracy of some final 
value obtained at the end of a large number of steps. 

Although there is a vast and interesting literature concerned with 
various aspects of the problem of error estimation in digital computa- 
tion (see, for example, Refs. 3, 4, and 5) , the results presented above, 
and their proofs, appear to be most closely related to earlier results 
concerning the input-output stability of continuous-time nonlinear 
feedback systems. 6 -* Indeed, the writer is not aware of any lower- 
bound results in the numerical analysis literature of the type described 
above. However, some upper bounds concerning (2) of (for example) 
the form |e«| ^ K with K independent of n (which imply <e) ^ K) 
have been obtained in certain cases. In this connection, our condition 
that guarantees the boundedness of {e„} is often weaker, and our upper 
bounds on (e) are often much stronger, because, for example, the ym-i 
can become very small as m becomes large. 

Our approach can be applied to several other problems in numerical 
analysis. In particular, with reasonably direct modifications of our 
proofs, analogous theorems can be proved concerning the numerical 
integration of systems of second-order ordinary differential equations. 

2.2 Examples 

Euler's Method: y n ^ = y n + hy' n 

Here F(z) = 1 - [1 - K* + P)W\ so that F ( 2 ) ^ for | z | ^ 1 if and 
only if < *(« + ffl < 2, and | F(e f '")J - 1 1 - [1 - K« + fi)h]e~ iu |. 
Thus (with < K« + P)h < 2 )> 

min | F(e iu ) |"i = [1 + | 1 - K« + P)h W 

u 

max | F(e iu ) | _1 = [1 - | 1 - K« + P)h \]'\ 

The locus of is the circle shown in Fig. 2, since @(w) = 1 - e~ ,u . If 
ah > and /3A < 2, then the critical disk (Fig. 2) is not intersected 
by the locus of 0, the condition that < K« + &) h < 2 is satisfied, 
p < 1, and in accordance with the last paragraph of the section preceding 
Section 2.1: 

p = 1(0 - a )h max ([|(/3 + a)hT l , [2 - W + a)^]" 1 )- 



* It is interesting to note that the possibility of exploiting feedback-theoretic 
ideas has been emphasized by Hamming. 2 
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Fig. 2 — The locus of 0(w) for Euler's method, and the critical circle C. 

If < (a + p)h < 2, then 

min | F(e iu ) | _1 = [2 - h(a + 0)h]~' , 

max | FOr) |- - [i(a + p)h]-\ 

u 

(3 — a 

P = 

and 



+ «' 



<c> ^ [1 + 03 " «) 08 + a)" 1 ]" 1 ^ ~ l(a + W*fr> 

<e> ^ [1 ~ 08 - a) 03 + aO'TEK" + fl*]"^). 

For estimates of T n , see Ref. 1 or 2. 

As a remark concerning the necessity of the condition p < 1, we note 
that if ah > 0, but /3h > 2, then for even the special case in which 

f(x, t) = 0x, we have e , e x , e 2 , • • ■ unbounded, since y , y A , y 2 , • • • is 
unbounded (assuming merely that y 5^ 0). 

The Formula y n+l = y n + \h{y' n + y' n+ \)'- 

In this important case 

F(z) = 1 + \{ a + 0)h - [1 - l(a + flAjr 1 , and 

We have F(«) ¥° for |z| S 1 if and only if {a + p)h > 0, and [as- 
suming that (a + {3)h > 0] : 
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min I F(e'") I -1 = [1 + |(« + 0)h + | 1 - K« + 0)A If 
max | F(e*") | _1 = [1 + |(« + ft* - I 1 - K« + 0)* IF 1 - 

The locus of lies entirely on the imaginary axis of the complex 
plane, 

g - a 

P ~ (* + *' 

and obviously p < 1 if a > 0. 
If « > and (« + /?)/i < 4, then 

min | F(e ia ) | _1 = § 



max | F(e iu ) \~ l = [i(a + 0A]" 



and 



<c> ^ [1 + 03 -*)(P + a)~T l iM 

(e) g [1 - 09 - a)08 + «)"']" [*(« + OT~»- 
The last inequality can be written as simply 

(e) ^ (ah)-\<p). 
For the integration formula under consideration, 

_ h 3 x'"(y„) 
J " 12 

where v „ lies in the interval [(« - p)h, (n + l)h}. 
Here p = 0, and 

p., = (.r - 2/ ) + *M/(*o , 0) - /(2/o , 0)1. 
Thus, assuming for the purpose of illustration that roundoff errors can 
be neglected: 

(M + 1)- I *>-. I 2 + (M + I)" 1 Z I • 12 ^- ; |j 

provided that « > and (a + fi)h < 4. If, for example, (£-<*)• 
(0 + a )-i - \ and i(a + j9)A = i, then the ratio of our upper bound 
on (e) to our lower bound is 18. If (j8 - o) (/8 + a)" 1 = $ and 
i(a + £)/* = i, then the ratio is 42. 
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2.3 Results jor the Vector Case (N ^ 1) 

We shall state our results for N ^ 1 in a slightly more formal fashion. 

Definitions: 

(i) Let || 3 || denote (Z*=i 9*) 5 for every real A^-vector q = 
($1,0.2, ■■■ , q N ). 

(ii) Let \df(x, t)/dx} s and \dj(x, t)/dx} A denote, respectively, the 
symmetric and antisymmetric (i.e., skew symmetric) part of d/(x, t)/dx, 
the Jacobian matrix of J(x, t). 

(in) Let F(z) 4 i _ £» =o a #**» + i (a + p)h £»__, ^" <fc+1) 

(«;) e„ = .r(n/i) - y n , n ^ with the ?/„ for n ^ (p + 1) denned by 
(3). 

(v) With M an arbitrary positive integer such that il/ ^ (p + 1), let 



<*>4 ((.1/ + D- Zlk.ll 2 )' 

> m-<> / 

<^)4 ((M + l)- 1 t\\<P~-> II 2 )' 



where the <?„,_, are denned in (5) and (6). 

Assumptions: 

Let the smallest eigenvalue of \df(x, t)/dx\ s be bounded from below 
by the real constant a(a > — qo ) for all t ^ 0, and let the largest eigen- 
value of {df(x, t)/dx\ s be bounded from above by the real constant 
/3(/3 < oo ) for all t ^ 0. Let the modulus of the largest eigenvalue of 
\dj{x, t)/dx) A be bounded from above by the real constant 7(7 < °o) 
for all t ^ 0. 

Definition: 
Let 

P 4 |i(/3 - a)h + 7/'] 



max 

OSwS2r 



E &,e-' ( * +1) " 



Theorem I : Ij 

(i) the assumptions 0} Section I concerning /(.r, /) and (2) are satisfied, 
(ii) 1 + |( a + 0)ft&_, ^ 0, 
(w) F(z) j* Ofor all I z I ^ 1, 
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then 

(e) > (1 + p) _1 min | F(e iu ) | _1 (<p). 



OStt <'!> 



Theorem 2: If (i), (ii), and (Hi) of Theorem 1 are satisfied, and if p < 1, 
then 

(i) (e) ^ (1 - p)" 1 max | F(e iu ) | _1 (<p), 

0gug2r 

and 

(ii) sup || <p m - x || < co implies sup || e m || < «> . 

Corollary to Theorem 1: If (i) 0/ Theorem 1 is satisfied and there 
exists at least one real constant fci such that 

1 - Z a*" tt+1) + M E M _( * +1) ^ 

/or aZZ I z I ^ i, tfien iftere ezisis a positive constant k 2) which depends 
only on ao , a x , • • • , a p , 6_i , 6 , ■ • •, b p , a, 0, and y swc/i i/iai 

(e) ^ /c 2 <y>). 

Theorems 1 and 2 are proved* in the following section. The proof 
of the corollary is very simple. 

Since 

fc-0 * — 1 

for all I 2 I ^ 1, there exists a fc{ such that 

! _ 2 0t3 -^» + k{h t, V**" ^ 

for all I z I i? 1, and 

1 + k[hb- x ^ 0. 

Choose a! and /3' such that a' ^ a, 0' ^ /3, and f(a' + 0') = fc{ . If we 
replace a and /3 with a' and /3', respectively, we see that Theorem 1 
applies. 



* Our proofs actually yield sharper, but less explicit, bounds on (e) than those 
of Theorems 1 and 2. See (31) and (37). 
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III. PROOFS 

Prooj of Theorem 1: 
We have 

2/ n+ i = 2 <ikVn-k + h 2 Mn-fc + -ffn , n t p 
and 

fc-0 fc— -1 

Thus, 

?' 

*-o 

- ^ Z M/[z„-*(n - fc)fc] - j[y n _ k , ( n - k)h]} + <p n , n^p 
and, with <f„ denned for n ■= -1,0, • • • , (p - 1) by (6) , 

V V 

e„ = J2 arfn-i-k ~ h 2 M/[s«-i-* , (n — 1 — &)/*] 

*=0 A--1 

- f[2/»-i-* , (n - 1 - h)h]\ + *>„_, (16) 

for n ^ 0. 

As a matter of convenience we define a_ x = 0. 

Lemma 1: There exist real sequences {w k }? =0 and {v k \?_ n both belonging 
to l t (i.e., with the property that 2"- I w >= I < °° and ^t-o \v k \ < oo) 

-h Z b# lk+1) 

W(z) 4 X w ,, 2 -* = - —j S=! (17) 

1 - E [a, - }(a + 0)hb k \T ik * l) 

k = -l 

V(z) 4 £ Vk ?-* = - 1 (18) 

1 - Z [«h - *(« + /3)A& t ]T { * +I) 



* 



/or <dl | g | ^ 1. 

The proof follows at once from the standard theory of z-transforms, 
in view of assumptions (it) and (Hi) of Theorem 1. The details are 
omitted. 
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Lemma 2: Let 8, = f(x n , nh) - f(y n , nh) - J(a + &){x, - y„) for all 
n ^ 0, and let {w k } and {v k } be as described in Lemma 1. Then 



e n = Zw n - k S k + !>,.-*>*-, (19) 

ft-0 *-o 

for n = 0, 1, •••,i¥. 
Proof of Lemma 2: 

From (16) and our definition of 8„: 

e n = Z [^ - *(« + 0)hb k \e n - k - x 

k— i (20) 

- fe Z & * «-*-' + *-» ' n = °- 

We multiply both sides of (20) by e~ inu and then sum from n = 
ton = M to obtain 

M v " 

lf=o *— i 

A=-l n-0 »-0 

for all o, c [0, 2tt] . Using e„ = 8„ = for w < 0, we have 

Z ■-'*"«-*-. - a"" 1 *"" Z a'""-. " •" <<1+M " £•"•""*. (22) 



and 

f>-<« *„_,_, = ,-'«♦«• £«-*- 8„ - r ,<,+w " I [ •- 1 - «- • (23) 



n-0 



Thus, from (21), (22), and (23) 

Af Af M 

jf=o »-o »-» 

+ 7(«'-){a Z &*-" x+w " Z e_, '" u 5 » 

- £ [4. - *(« + «M*-' a+W " t e- in °*n\ (24) 

fr = -l H--V-* 

for all co £ [0, 2tt] . 
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The expression within the braces in (24) can be written as 

n=l) 

with s n = for n = 0, 1, • • ■, M and for n > (1 + M + p)- Since 
{*>*! £ h , we have 

°0 °0 " 

V(e iu ) £s n e~'" u = 2Z e_ '" w 2 y «-* s * • 

R-0 n=0 fc-0 

Similarly, 



V(e lu ) 2Z e ""V„-i = ^e Z) y n -*(v*-i)A/ 

n-0 ;i = A— 

in which 

(<Pk-i)at = p*-i , & ^ M 

= A; > M 
and finally, since {w k } e li , 



n=0 *- = o 



where 



Thus, 



= 0, & > M. 



n-0 n=0 *=o 

00 n 



,=0 t-0 



n = *-=o 



for all o) t [0, 2tt]. Since 
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for n = 0, 1, • • • , M we have* 

e n = £ ml-* &k + 12 v n -*(Pk-i (25) 

k = k-0 

for n = 0, 1, • • • , M . This completes the proof of Lemma 2. 
Lemma 8: If (19) holds for n = 0, 1, • • •, Jlf, Men 



2 ?•„_*¥>*- 1 

fc-0 



2 \J 



(Jl/ \\ / if \ n 

zikii 2 +(z E«.-*« 
n=0 / \n=0 I *-0 



lA* 



(26) 



and 
u 



(27) 



I M \ \ I M I n \\2\i I Ml | | H 2\ J 

(Zll'-ll 2 ) ^(Z I>.-*M ) + Z Ewm ). 

\n-0 / \n = II 4 = II ' \n-0 II t-0 ' 

Proof of Lemma 3: 

Inequality (26) or (27) follows from (19) by two applications of 
Minkowski's inequality. We leave the details to the reader. 
Inequality (27) is used only in the proof of Theorem 2. 

Lemma 4- 



k-0 

Proof of Lemma 4-' 
By Parseval's identity, 



^ max | W(e ia ) | 2 £ || 8 n || 2 . 



u 

Z 

n-0 



£ «>«-* 5 * 



in which 



Therefore, 



Mr 

27T Jo 


| n=0 k-0 


-k Sk 


( 


2tt Jo 


M n 

| 71 = <t = 


-ki&kjAf 


(5*)m = S k , 


fc ^ M 




= o, 


k > M. 







£ w »-* 5 < 



■^T Jo I n = t-0 



du. 



* We could have obtained (25) from (20) directly using standard z-transform 
theory, if we had introduced further assumptions which guarantee that the se- 
quences {y n }, {x(nh)}, and {<p*-i} are transformable. However, this would have 
complicated the statement of our results and would have weakened them in a non- 
trivial manner. 
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But since {w k } e h, 

oo n no M 

2 e" ,un 2 «>n-*(S*)jtf = J] e~ <Mn w„ J2 e~ iak S k 

n-0 * = n = * = 

= W(e ia ) TV*"* 5. . 
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Thus, 






1 Z* 2 ' I I M 

^" J I I Jfc=0 

^ max \W(e <m )\*±- T II £•"'"*«. 

0Sug2T ^7T J || fc = 



rfw 



^ max | W(e im ) | 2 £ || 8 k || 2 

0SuS2i fc = 

which proves Lemma 4. 
Lemma 5: 

ziifinir)*^ [*o*-«) + 7](i;iie. 

Proof of Lemma 5: 
We shall prove that 

II*. II ^ [*0B-«)+7]||*|| 

for n = 0, 1, ■•■, M. 

By definition 

|| S„ || = || /(a. , nh) - 1(y n , nh) - |( a + /*)(*. - y n ) ||. 
Letg(a) = /[a.r„ + (1 — a)?/„, nft] forae [0, l].Then 

|J = /'K + (1 - a)y n , nh](x n - y n ) 

in which f[ax n + (1 — a)y„ , nh] denotes the Jacobian matrix of 
f(x, t), evaluated at x = ax n + (1 — a)y nj t = nh. Now, since q(l) — 
g(0) = f{x n , nh) - f(y n , nh),we have 



f fa da = '(*•'"*> ~KVn,nh). 
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Therefore, 
|| s n || = [ \f[ax n + (1 - a)y n , nh] - (a + /3)L V ) da(x„ - y n ) 

I I •'0 

in which ljv denotes the identity matrix of order N. 

For H an arbitrary N X N matrix, let || H \\ = (largest eigenvalue of 
H*H)*, in which H* denotes the complex-conjugate transpose of H 
(i.e., let || H || denote the "spectral norm" of H). Then 



8 n || ^ \\ f {f[ax n + (1 - a)y n ,nh] - K« + PM 

I ■'0 



da 



x n - y n 



(28) 



^ f II f[ax n + (1 - a)y n , nh] - i(a + (3)1 N 

Jo 



da 



With {/'}s and {f} A , respectively, the symmetric and antisymmetric 
parts of /', we have 

II /' - }(« + fi)U II ^ II l/'l- - *(« + 01* II + II f/'U II- 

For each a £ [0, 1], there exists 7 an orthogonal matrix T x such that 
T x {f) 8 Tt i = D = diag (f i , f 2 , ■ ■ ■ fa) in which, since f,- is an eigen- 
value of {f}s , 

a S ti £ 

for j = 1,2, ■■■ ,N. Using || T x \\ = || 77 1 II = 1, 

|| {/'}, - }(a + /3)l.v || = || ri" 1 OT, - }(a + fiT7 l T t || 

^ II 2Y' d - u<* + p)t; 1 ih|7\ h 
^ II t; 1 [HI £ - K« + 0i» W'WTt II 

^ max | f, - K« + 0) 



£*0-«). 



(29) 



Thus, || f - K« + fl)ljr II ^ *G8 - a) + || {/' U II- 

Consider || {f} A ||. For each a £ [0, 1] there exists 7 an orthogonal 
matrix T 2 such that T 2 {f) A T~ l = 5 is a direct sum of 2 X 2 block 
matrices of the form 



B i = 



6y 

l-b, 0J 



ACCURACY OF NUMERICAL SOLUTIONS 



1259 



and, if N is odd, and "1 X 1 matrix" containing the zero element. 
Clearly, 



BIB, = 



b) 

LO &,J 

S*S is a diagonal matrix, and its largest element is not greater than y 2 . 
That is, 

II ifu || = nav'flTi || ^ || s || g 7 , 

and consequently 

|| /' - J(« + 13)1* || S *# - «) + 7 (30) 

for all a e [0, 1]. Finally, from (28) and (30) 

\\ S n || ^ [|(|8 -a) +7] ||c„ ||. 
At this point we have proved that with P as defined in Section 2.3 



£ 



V„-k<Pk-\ 



2 \j / M 



(31) 



We now need the following result. 
Lemma 6: 



2_j Vn-k<Pk-\ 



> 



:- mm | F(e ia ) | 2 £ II *»*-i 



Proo/ o/ Lemma 6: 

Consider S = Sn'o II £*-o »•-««* II". with the c*'s N-vectors. Choose 
Cm+i , c, w+2 , • • • so that 

2>-»ft = 0, n ^ (il/ + 1). 

A- = n 

This is possible since v j* [v = lim F(*)]. Then 



n = U I A = fl 



i r 2 ' I * 

-7T Jo || „ = i, t-0 

Under the assumption that 



'»— *c* 



(32) 



rfw. 



2 II c * II 2 < °° - 

A-0 
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we can write 

2 «"*'"" JLv„- k c k = 52e~ iun v n E«"'"^ (33) 

n = fc-0 n=0 t-0 

in which the last sum over k is interpreted as the usual limit in the 
mean. From (32) and (33) 



S 



1 r 2r 1 1 °° ~ 

= «- / 2- e y « 2- e c * 

«1T Jo II n = i=0 

1 »2r II oo 

F(e'-)\-'tf Z «-'«. 

•^T Jo I I *-0 



^ mm 

0fiuS2 



^ min |F(e {u ) |" 2 £ || c 4 

OSoiS2t *=" 

it/ 

^ min | F(e € ") |~ 2 Z ll c * 



We now prove that 



Let 



z.n*ir< »■ 



U = Z»n-A , n ^ 0. 



(34) 



Of course: q n = for n S (M + 1) . 

Let K be an arbitrary positive integer. Then 

K n K 

n=0 t=0 *=0 

With 

(c*)jc = c* , k ^ K 
= 0, /c > K 

we obtain 

Ee _, '" u X)w„- t (c t )jr - Z« '"" £»-*(c*)k - Z)e '""$„ . 

„~0 t-0 iv + 1 1=0 n = 

Therefore, 

£;«-«■* - [i - £ [a* - k« + e)hb*K ilh+1) *\ 

*=o L t=-i J 

■ E •-'- E »-*(<*)* + W") Z e-'""ff, . (35) 

A'+l * = n-0 
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&IC J I) II »=0 



f/co, 



from which we obtain 

A" 



E II c* IP ^ max \F(e iu ) | 2 £ II ?» II" 

* = 0SuS2ir n = 



(36) 



g max \F(e iu ) | 2 £ || q n || 2 . 

0SuS2t n-0 

Since (36) holds for all K > 0, we have completed the proof of Lemma 
6* 
Inequality (31) and Lemma 6 prove Theorem 1. 

Proof of Theorem 2: 

By Lemma 3 [inequality (27)], Lemma 4, and Lemma 5 of the 
preceding proof, we have 



e n 



SpZ 



' MI'V-MZ 



2j v n - k <p k -i 



»\i 



(37) 



Furthermore, by essentially the same argument used to prove Lemma 
4, we find that 



I It II n 

IE E^-jhp*-! 

\n = I I t-0 

Therefore, with p < 1, 



=2 max |F(0 l-MEll^-i " 

0£wS2t \*-0 



(2 



^ (1 — p) max 

0SuS2x 



I M 

Fie^rlZ 



fk-l 



We now prove that sup || <p n - y || < °° implies sup H e B || < oo . Assume 

n£0 n&0 



♦Alternatively, we can show using (35), that only a finite number of c* are 
nonzero. 
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that sup || <p n -x || < oo. Let 



k = 



Then we have 

e n = 2 w n . k 8 k + u„ , n = 0, 1, 2, • • • , M (38) 

fe-0 

b k = Kx k , m - f( yt , kh) - |(« + fifa - yd ( 39 ) 

in which, since {v k \ e l t , sup \\u n \\ < °o. 

n>0 

There exist positive constants d and c 2 such that | w n | ^ c x e 
for all n ^ 0. By continuity, since p < 1, there exists a o- £ (0, c 2 ) such 
that 

p, ^ max | W(fi"~) | [|(/3 - a) + 7] < 1 

in which of course 

W(e"~ ) = Z im"""""*- 
From (38) and (39) : 

e„ = E *»»-* h+%n, n = 0, 1, 2, • • • , M 

lc-0 

l„ = j(£ k , kh) - i(y k , kh) - }(a + $)(£* ~ ft) 

where e n = e' n e n , tf>„ = e an w n , u n = e'"u„ , 8 k = e" 8 k , £ k = e" x k ,y k = 
e ak y k , and j{q, kh) = e' k f(e~' k q, kh) for all AT-vectors q. 

The Jacobian matrix /' of / is related to the Jacobian matrix of / by 

1'(q, kh) = j'(e- k Q, hh) 

from which we see that f satisfies the assumption concerning f relative 
to the numbers a, 0, and 7. Therefore, by the proof of Theorem 1, 

Ell*. In *(* -aTHEIK 

for all M ^ (p + 1). 
Now, 
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^sup || u„ || \— — ) 

nio v e — I ' 



r(.W+l) 



^ sup 1 1 w„ 



'"^ II "■•■ II , 2<r ,\J 

nsn (e - ly 



and so, 



r.o / (e — 1)* nio 



(40) 



for all M ^ (p + li. 

From (38) : 



e M ^ 



zl V>M-k 8 k 



+ sup 1 1 u„ 1 1 . 



(41) 



We shall now use (40) to bound the first term on the right side of (41 ) . 
Using the Schwarz inequality, 



£ W M -h 8 k 



= e 



£ "tots-k 8k 



< "(i\w„A\t\\^ 



(42) 



By the proof of Lemma 5, 

|| 8k || ^ [103 -«)+7] ||% 
which leads to 

£ *"«-* 8 k 



t [|C8 - a) + 7](1 - P.)" 1 sup || m„ || ( £ | 0, 



Since <r « (0, c 2 ) [see the paragraph below (39)], \w„\ 2 e l x , and therefore, 



sup 

.WXp+l) 



2 w u-k S k 



< «. 



(43) 



Finally, from (41) and (43), we have sup || e„ \\ < «, which completes 
the proof of Theorem 2. 
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IV. A CONDITION FOR THE SOLVABILITY OF (2) FOR y n + 1 

The problem of solving (2) for y„ +1 is that of solving the equation 

y + hb.rKy, t) = g (44) 

for y, with g a given iV-vector. We write (44) as 

Qy = g (45) 

in which the operator Q is defined by the condition that Qv = v + 
hb_-if{v, t) for all real A 7 - vectors v. 

We prove below that (with (-,-) denoting the usual inner product of 
real iV- vectors) : 

(Qy a - Qy„ , v. - v„) ^ Ml v> ~ v» II 2 W 

\\QVa - QV!. II ^ ^ || Vu - 2/6 II (47) 

for every pair of real A 7 "- vectors y a and y b , in which k x = (1 + hb- x a) 
if b_! ^ 0, fci = 1 + hb^p if 6_i ^ 0: and fc 2 = {1 + h | 6_i | ■ 
[max (| a |, | jS |) + y]}. Since fc 2 < oo, according to a special case of 
Theorem I of Ref. 8, if fc x > (e.g., if b_i ^ and 1 + ftb_i« > 0), 
then (45) possesses exactly one solution which can be determined by an 
iteration procedure that is only slightly more complicated than the 
usual procedure 1 - 2 (which is valid only under much stronger conditions 
on h). 
To derive (46), let 

q(v) = Wa + (1 - v)v* + hb-xftWa + (1 - n)y. . fl 

for v £ [0,1]. Then 

q'(v) = (y a - yt) + hb-j'ivya + (l - n)Vh . ftfy* - y»). 

and so 

Qy a - Qy„= [ q'(v) dv 

Jo ( 48 ) 

= f {l w + hb-if'lny. + (1 - i?)2/6 , t]\(y a - y b ) d v . 

Jo 

Thus, 

(Qy a - Qy b , y a - y b ) 
= (f {i,v + hb-j'Wja + (i - 77)2/1, . *]} *i<y- - y»). (y» - y»)) 



ACCURACY OF NUMERICAL SOLUTIONS 



1265 



Va ~ Vb 



= f ({u + W-,/'[w. + (i - i)v» . t]\(ya - y»), (v. - yd) dv 

Jo 

= \\y a -y b \\ 2 

+ M>_, [ (f s [vy a + (i - v)y b , tKy a - v*), (y* - y b )) dv, 

Jo 

in which f& denotes the symmetric part of f. Thus, since the eigenvalues 
of /J are bounded from below by a, and from above by /3: 

(Qy u - Qy b , y. - y„) £ (1 + hb. ia ) \\y a - y b ||\ &-, ^ 
^ (1 +W-SJ8) ||7/ a - y b || 2 , fc_, ^ 0. 

Consider now the derivation of (47). By (48), 

II Qy a - Qy b II 

= f {Iat + hb-j'lny. + (1 - *)0» , *]} di7(y„ - y 6 ) 

g r {i.v + w-,f[i«/. + a - 1?)** , A) *p 

I I •'0 

£ /|| 1* + Kb-xV [Wo + (l - 17)2/6 , t] || dq || y a - y b ||. 

•'0 

But, with /^ the antisymmetric part of /', 
|| I, + *&-i/'[w. + (1 ~ *)y6", fl II 

sli,+ a| b -i I llf[w. + Ci - n)v» .fl II 
g 1 + A|b-, HI ft II +A| &-i HI ft II 

g 1 + fc I b-i I max (| a |, I |8 |) + A I 6_, I 7- 
Therefore, 

II Qy a - Qy» || 3 U + A I b_, I f max (I « l> I I) + ?]} II y« - ih II 

which is equivalent to (47). 
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